A Framework for Learning Morphology using Suffix Association Matrix

Authors

  • Shilpa Desai
  • Jyoti Pawar
  • Pushpak Bhattacharyya
Abstract

Unsupervised learning of morphology is used for automatic affix identification, for morphological segmentation of words, and for generating paradigms, each of which lists all the affixes that can combine with a given list of stems. Various unsupervised approaches segment words into a stem and a suffix, and most of them assume that suffixes occur frequently in a corpus. We have observed that for morphologically rich Indian languages such as Konkani, 31 percent of suffixes are not frequent. In this paper we report our framework for an Unsupervised Morphology Learner that also works for less frequent suffixes. Less frequent suffixes can be identified using the p-similar technique, which has been used for suffix identification but cannot be used to segment words with short stems. Using the proposed Suffix Association Matrix, our Unsupervised Morphology Learner can also segment short-stem words correctly. We tested the framework on learning derivational morphology for English and two Indian languages, Hindi and Konkani. Compared with other similar segmentation techniques, it improved both precision and recall.
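
As a rough illustration of the p-similar idea mentioned in the abstract, the sketch below groups words that share their first p characters and counts the pairs of endings left after the longest common prefix. It is a minimal sketch of the general technique under our own assumptions, not the authors' implementation: the function name `p_similar_suffix_pairs`, the default `p=3`, and the toy word list are all hypothetical, and the Suffix Association Matrix itself is not reproduced because the abstract does not define it.

```python
from collections import Counter
from itertools import combinations


def p_similar_suffix_pairs(words, p=3):
    """Count candidate suffix pairs from words sharing their first p characters.

    Hypothetical helper for illustration only; p=3 is an assumed default.
    """
    # Bucket distinct words by their first p characters.
    buckets = {}
    for word in set(words):
        if len(word) >= p:
            buckets.setdefault(word[:p], []).append(word)

    pair_counts = Counter()
    for group in buckets.values():
        for a, b in combinations(sorted(group), 2):
            # Find the longest common prefix; it acts as the candidate stem.
            i = 0
            while i < min(len(a), len(b)) and a[i] == b[i]:
                i += 1
            # The remainders are the candidate suffixes ("" means no suffix).
            pair = tuple(sorted((a[i:], b[i:])))
            pair_counts[pair] += 1
    return pair_counts


if __name__ == "__main__":
    toy_words = ["walk", "walked", "walking", "talk", "talked", "talking"]
    for pair, count in sorted(p_similar_suffix_pairs(toy_words).items()):
        print(pair, count)
```

In this sketch a word shorter than p is never bucketed, and two words whose shared stem is shorter than p are never compared, which is one way to picture why a prefix-length threshold can struggle with short stems.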

Similar resources

On-Line Cumulative Learning of Hierarchical Sparse n-grams

We present a system for on-line, cumulative learning of hierarchical collections of frequent patterns from unsegmented data streams. Such learning is critical for long-lived intelligent agents in complex worlds. Learned patterns enable prediction of unseen data and serve as building blocks for higher-level knowledge representation. We introduce a novel sparse n-gram model that, unlike pruned n-...

The morpho-phonological interface in Specific Language Impairment

This thesis investigates the nature of the interface between two components of language morphology and phonology in children with Grammatical-Specific Language Impairment (G-SLi), compared to those with typically-developing language. I focus principally on the impact of phonological complexity on past tense inflection, but I also investigate other areas of morphology. More specifically, I show ...

Brain signatures of early lexical and morphological learning of a new language.

Morphology is an important part of language processing but little is known about how adult second language learners acquire morphological rules. Using a word-picture associative learning task, we have previously shown that a brief exposure to novel words with embedded morphological structure (suffix for natural gender) is enough for language learners to acquire the hidden morphological rule. He...

Toward a Unified Framework for Inference of Hidden State under Partial Observability

While various techniques exist for learning in partially observable environments such as POMDPs, there has yet to emerge a unified theory that frames the problem in such a way as to explain the fundamental issues, tradeoffs, and approximations involved. This work is a first step in that direction, providing a unifying framework for hidden state inference in deterministic finite POMDPs. We prese...

Unsupervised Learning of Morphology for English and Inuktitut

We describe a simple unsupervised technique for learning morphology by identifying hubs in an automaton. For our purposes, a hub is a node in a graph with in-degree greater than one and out-degree greater than one. We create a word-trie, transform it into a minimal DFA, then identify hubs. Those hubs mark the boundary between root and suffix, achieving similar performance to more complex mixtur...


Publication date: 2014